Low-latency Network Monitoring via Oversubscribed Port Mirroring
Abstract
Introduction
Modern networks operate at a speed and scale that make it impossible for human operators to respond manually to transient problems, e.g., congestion induced by workload dynamics. Even reacting to issues within seconds can cause significant disruption, so network operators overprovision their networks to minimize the likelihood of problems. Software-defined networking (SDN) introduces the possibility of building autonomous, self-tuning networks that constantly monitor network conditions and react rapidly to problems. Previous work has demonstrated that an SDN controller can install new routes in tens of milliseconds [10], but state-of-the-art network measurement systems take hundreds of milliseconds or more to collect a view of current network conditions [1, 2, 3, 4, 11]. To support future autonomous SDNs, a much lower-latency network monitoring mechanism is necessary, especially as we move from 1 Gb to 10 Gb and 40 Gb links, which require 10x and 40x faster measurement to detect flows of the same size. We believe that networks need to, and can, adapt to network dynamics at timescales closer to milliseconds or less.
This paper introduces Planck, a network measurement architecture that provides statistics on 1 Gb and 10 Gb networks at 3.5–6.5 ms timescales, more than an order of magnitude improvement over the state of the art. Planck does so with a novel use of the well-known port mirroring mechanism: it mirrors all traffic traversing a switch to a small number (one in our current system) of oversubscribed monitor ports. When more traffic is sent to the monitor ports than they can handle, traffic is dropped, resulting in the monitor port emitting a “random” sample of all traffic on the switch. As this is done at line rate in the data path, Planck collects orders of magnitude more samples per second than is possible using previous monitoring mechanisms.
In addition to supporting sFlow-style [9] sampling, Planck provides extremely low-latency link utilization and flow rate estimates, as well as alerts when congestion is detected. A Planck collector, consisting of software running on a commodity server, receives these samples and performs lightweight analysis to provide measurement data to interested parties, e.g., an SDN controller.
Design
Planck consists of three main components, shown in Figure 1: (i) switches configured to provide samples at high rates, (ii) a set of collectors that process those samples and turn them into events and queryable data, and (iii) a controller that can act on those events and data. This paper focuses on the first two.
Fast Sampling at Switches
Planck is inspired by sFlow [9], which is designed to randomly sample packets and forward them to a collector in real time. However, sFlow samples are typically processed by switch control planes, which limit the rate of samples, as shown in Figure 1(a). For example, the IBM G8264 produces at most 350 samples per second, so obtaining an accurate view of the network takes seconds or longer. Counter polling, as in OpenFlow, also uses the control plane and suffers from similar problems.
We overcome this by leveraging the port mirroring feature found in most commodity switches today, which is traditionally used for diagnosing problems. Port mirroring allows all (or a subset) of the traffic received on a given input port to be both forwarded normally and copied out a mirror port. Importantly, this is done at line rate in the data plane, allowing for as many mirrored packets as the mirror port has bandwidth to carry. We repurpose this functionality to efficiently produce samples by oversubscribing the mirror port(s). Oversubscription causes the output buffer for the mirror port to eventually fill and then drop packets, constraining the samples to the bandwidth of the output port.
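The effect of oversubscribing a mirror port can be illustrated with a small simulation. The sketch below is not Planck's implementation; it is a toy model that assumes fixed-size packets, a drop-tail FIFO output buffer, and data ports that each forward exactly one packet per time slot. The function name `simulate_mirror_port` and its parameters are hypothetical.

```python
import random
from collections import Counter, deque


def simulate_mirror_port(num_data_ports, buffer_pkts, num_slots, seed=0):
    """Toy model of an oversubscribed mirror port (assumed drop-tail FIFO).

    In every time slot each data port forwards one packet, and a copy of
    each packet is offered to the mirror port. The mirror port transmits
    at most one packet per slot; copies that arrive while the buffer is
    full are dropped.

    Returns (offered, emitted), where emitted counts the mirrored packets
    that left the mirror port, keyed by originating data port.
    """
    rng = random.Random(seed)
    queue = deque()
    offered = 0
    emitted = Counter()

    for _ in range(num_slots):
        ports = list(range(num_data_ports))
        rng.shuffle(ports)  # arrival order within a slot is arbitrary
        for port in ports:
            offered += 1
            if len(queue) < buffer_pkts:
                queue.append(port)  # copy accepted into the mirror buffer
            # otherwise the copy is silently dropped (buffer full)
        if queue:
            emitted[queue.popleft()] += 1  # mirror port sends one packet

    return offered, emitted


if __name__ == "__main__":
    offered, emitted = simulate_mirror_port(
        num_data_ports=3, buffer_pkts=100, num_slots=30000
    )
    total = sum(emitted.values())
    print(f"offered={offered} emitted={total} ratio=1 in {offered / total:.1f}")
    for port in sorted(emitted):
        print(f"port {port}: {emitted[port] / total:.3f} of samples")
```

In this model the mirror port emits one packet per slot regardless of how many copies are offered, so the fraction of traffic that survives as samples falls as more data ports become active, and each port receives a roughly equal share of the samples.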
We typically allocate 1 port of an N-port switch to be the mirror port and the remaining N − 1 ports to handle data traffic. This gives an expected sampling rate of about 1 in N − 1.
To study the latency of the samples received by the collector, we ran experiments under low and high congestion. The low-congestion experiment sent a single line-rate TCP flow and measured the latency between when tcpdump reported the packet as sent and when the collector received it. This latency was between 75 and 150 μs. The high-congestion experiment introduced 2 additional line-rate flows that do not contend for any resource except the mirror port. This caused the mirror port to become congested and fill its buffer space, which maximized the latency of our samples. Figure 2 shows a delay of about 3.5 ms on an IBM RackSwitch G8264 (10 Gb) and about 6 ms on a Pronto 3290 (1 Gb) switch.
Collector
The collector has four goals: (i) process sampled packets at line rate, (ii) infer the input and output ports for each packet, (iii) determine flow rates and link utilization, and (iv) answer queries about the state of the network.
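This excerpt does not describe how the collector derives flow rates for goal (iii). Because the fraction of packets that survive mirroring depends on how heavily the mirror port is oversubscribed, scaling sampled byte counts by a fixed ratio would be fragile. One plausible alternative, assumed here purely for illustration, is to use the TCP sequence numbers carried in the samples: the sequence-number difference between two samples of the same flow gives the bytes sent in between, even if intermediate packets were dropped at the mirror port. The `Sample` and `FlowRateEstimator` names, the flow key, and the `min_interval` parameter are all hypothetical.

```python
from dataclasses import dataclass

SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


@dataclass
class Sample:
    timestamp: float  # seconds, when the collector received the sample
    flow: tuple       # hypothetical flow key, e.g. a 5-tuple
    seq: int          # TCP sequence number from the sampled packet


class FlowRateEstimator:
    """Estimate per-flow throughput from sparse TCP samples (a sketch)."""

    def __init__(self, min_interval=0.0002):
        # Ignore sample pairs closer together than this many seconds, so
        # that back-to-back samples do not yield noisy estimates.
        self.min_interval = min_interval
        self.last = {}   # flow -> (timestamp, seq) of the reference sample
        self.rates = {}  # flow -> most recent estimate in bits per second

    def observe(self, sample):
        """Feed one sample; return the current rate estimate or None."""
        prev = self.last.get(sample.flow)
        if prev is None:
            self.last[sample.flow] = (sample.timestamp, sample.seq)
            return None

        dt = sample.timestamp - prev[0]
        if dt < self.min_interval:
            return self.rates.get(sample.flow)

        # Modular difference handles sequence-number wraparound.
        delta = (sample.seq - prev[1]) % SEQ_MOD
        if delta >= SEQ_MOD // 2:
            # Sequence number moved backwards: treat the sample as a
            # retransmission or reordering and keep the old estimate.
            return self.rates.get(sample.flow)

        rate = delta * 8 / dt
        self.rates[sample.flow] = rate
        self.last[sample.flow] = (sample.timestamp, sample.seq)
        return rate
```

For example, two samples of one flow taken 1 ms apart whose sequence numbers differ by 1,250,000 bytes yield an estimate of about 10 Gb/s. Summing such estimates over the flows inferred to traverse a link would give one rough approximation of link utilization.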